1. INTRODUCTION

With widespread Internet access and growing digital literacy, posting and searching for job vacancies
online has replaced the traditional methods of job searching. Online job portals are websites that
announce job positions and make it possible to find job vacancies at one's fingertips. They aggregate
job vacancies from companies and resumes of various applicants [1]. They serve as a means of posting and
searching for jobs and of selecting applicants for the advertised jobs [1], [2]. “Online
job vacancy portals contain job offers for almost all occupations and skill levels. These platforms are a rich
source of information about the skills and other job qualifications which are difficult to gather via traditional
methods” [3]. They are a potential data source for the analysis of labor market demand, that is, for
identifying, analyzing, and tracking skills requirements in the labor market [4]-[8]. The data published on
online job advertisement websites has become an increasingly significant area of research. Online job portals
provide a platform on which demand and supply meet, which could inform policy makers and enable
cross-country comparisons [5].

The need for graduates with a current skill set is of constant concern [9]. The growth and rapid
expansion of the IT sector and the introduction of new technologies have resulted in an abundance of job
titles that require current skills [10], [11]. Hence, IT professionals need to update their skills,
and in doing so they must acquire skills that are in demand [12]. With changing job trends, IT
professionals have better employment packages and job opportunities due to high demand for their
knowledge and skills [13]. The technological job skill needs of business and industry are continually
evolving, which presents a challenge to educators and students attempting to focus on the right skills to meet
these changing needs [14]-[16]. Furthermore, according to [17], information technology (IT) is a rapidly
changing field, and while studies done some time ago are useful and informative, it is necessary to
continually gather information about education and employment. Recruiters and hiring managers look for
prospective candidates who can immediately perform the work required by their job position. However, the
competencies acquired by graduates in the academe do not match the requirements set by these
companies. The demand for computer professionals is very high, and employers
set high qualifications for ICT jobs. Hence, the IT competency model was formulated. The IT competency
model [18] identifies the knowledge, skills, and abilities needed for workers to perform successfully in the
field of IT. The IT competency model is represented by a pyramid with several tiers, in which
competencies at the top are at a higher level of skill. Furthermore, the
shape of the model represents the increasing specialization and specificity of the proficiencies needed by the
industries. Its tiers are divided into blocks that represent competency areas (i.e., groups of knowledge, skills,
and abilities), which are defined using critical work functions and technical content areas. Tiers 1 through 3
represent the “soft skills” and work readiness skills that most employers demand. Tier 2, Academic
Competencies, covers competencies learned in a school setting, such as cognitive functions and thinking styles,
which are likely to apply to all industries and occupations. Tier 3, Workplace Competencies, represents motives
and traits, as well as interpersonal and self-management styles, which are generally applicable to a large number
of occupations and industries. Tiers 4 and 5 are industry-specific competencies needed to create career lattices
within an industry. The Employment and Training Administration’s IT model does not include Tier 5 competencies.
Included in this category are occupation-specific skills requirements and management competencies. This
study adopts the model, which lists IT knowledge, skills, and abilities. It intends to analyze and
discover the relationships between in-demand skills and jobs advertised online through association rule
analysis. Association rule mining, as cited in [19], is used to find associations between items or itemsets
and to derive association rules. Some published research has dealt with the extraction of information on skills
demand, the identification of in-demand skills, and the discovery and analysis of job requirements by applying
data mining techniques such as association rules to publicly available data sites. Furthermore,
textual analysis can be carried out by combining association rules and ontology mining [20].

The study of [21] used web mining to extract online job advertisements from a search engine to build
professional profiles, which were compared with those present in the official classification systems. The
association rules were then classified based on the kinds of jobs and on the kinds of qualifications. The authors
of [22] collected data from ICT job vacancy portals and applied association rules to determine the relationship
between ICT skills and careers. The paper of [23] introduced a methodology for identifying skills demand
from publicly accessible job vacancies using web and text mining tools; the authors were able to extract valuable
facts about the competences and abilities sought by employers. The paper of [24] used weighted association rules
to analyze job requirements in the IT field and obtained the relationship between job requirements and computer
skills. Other machine learning techniques were applied in [25], which
demonstrated data mining techniques such as k-NN text classification and information extraction
from a textual dataset to uncover knowledge from publicly accessible job vacancies. That research proposed an
approach for identifying occupations and labor market demands within a given job vacancy
dataset. The study of [26] analyzed job qualifications from a large set of data for choosing career and
professional goals. A survey was undertaken to collect and prepare data about employment, and the Apriori
algorithm was then employed to discover the frequent itemsets and the association rules. Another study [27]
applied the Apriori association rule algorithm and used recommendation techniques based on the output of the
skill associations to determine the most sought-after IT skills in the industry. The proposed method also finds
skill combinations that are prominent in job advertisements.

In [28], the authors analyzed current labour market demands for organizational
and end-user information systems professionals based on job advertisements from online job
portals. They analyzed and categorized demands using the job responsibilities and the knowledge and skill
requirements specified in these advertisements. Nesterenko [29] analyzed the Russian IT job market in terms of
requirements extracted from job advertisements and skills extracted from the profiles of potential employees.
Association rules were employed to extract frequent combinations of skills that characterize job profiles. The
study revealed two large groups of functional roles, and hierarchical clustering and association rules made it
possible to form nine clusters that are closer to professional fields. The results are considered interesting
because they make it possible to discover flexible, data-grounded job roles and skill patterns. Thus, they improve
the skill matching approach by allowing comparisons that take into account skills at different levels of the
generalization hierarchy and by comparing these results with the structure based on the skills in job
advertisements. Hossain, Arafin, and
Mohammad [30] proposed an intelligent system to recommend appropriate jobs to freelancers from different
online job sites. They used machine learning techniques to classify jobs and Apriori rule
mining to derive the frequent skill sets used in the completed jobs of freelancers. A list of possible jobs is
created for freelancers by matching these frequent skill sets with the skills required in the posted jobs.

On the other hand, the research conducted by [31] suggested a method to evaluate
students' programming skills using data mining techniques such as association, classification, and
clustering. It also specifies the means to identify their skills and to help them improve their
knowledge by predicting training programs. The study of [32] employed a visual exploratory discovery and
analysis approach to determine the demand for jobs, the skills specific to a domain or industry sector, and the
additional non-domain skills required to fill a job role. Meanwhile, the work of [33] explored the
significance of key skills to employers, their perceptions of the availability of these skills in the labor force,
and employers' knowledge and use of the key skills required for the job. The research employed a combination of
quantitative and qualitative data. The quantitative data came mainly from surveys of
employers, who were also interviewed about key skills. Based on the results of the survey,
around half of the respondents knew about core skills; however, 41% of those aware of core skills were
unable to name any specific skills. Employers were most likely to name basic skills, which suggests a
confusion between basic skills and key skills that was also reflected in the interviews with employers.
Only employers who had direct links with education and training were aware of the key skills. Both the
quantitative and the qualitative data illustrate the importance of key skills to employers. Ziegler [34] concluded
that the skill requirements listed in online job ads can offer important insights into skill demand and skill wage
differentials. In light of the above, it is worthwhile to complement traditional skills research
with a more flexible and dynamic approach to determining skill gaps. This study proposes a new methodology
for retrieving and analyzing the skills content of online job advertisements.

This paper analyzes words and word patterns of IT jobs published online in relation to the skills
requirements as perceived by the industry. The study helps to determine the actual and future needs and
trends of IT jobs in the market. Furthermore, the results provide insights into the gap between the skills
acquired in school and the actual skill needs of the IT industry, and they can serve as a basis for curriculum
enrichment and for laying out an intervention program to address that gap. The study seeks to attain the
following objectives:

— To determine the skill words/word patterns needed by the IT industry in the labor market based on online
advertisements.

— To analyze the skill words/word patterns needed by the IT industry in the labor market based on online
advertisements.

This paper is organized as follows. Section 2 outlines the steps taken to extract information from online
job vacancy sites and the machine learning algorithms employed in the
study, specifically association rules. Section 3 discusses the results of finding patterns and associations between
job skills and jobs posted online. Finally, section 4 concludes the paper.


2. RESEARCH METHOD 

This study is descriptive research in nature. Data ingestion was used to gather published job
skills for IT professionals as stated in the CHED memo and the ACM IT curricula. The procedure for collecting
published job vacancy data on IT job skills requirements involves several steps, as shown in Figure 1.


2.1. Job information searching published job skills 

The process starts with selecting the information sources: a Google search using the keyword “information
technology jobs Philippines” and job-hunting sites in the Philippines such as JobStreet, Kalibrr, and
other job-hunting sites.


2.2. Data ingestion 

The identified job postings then entered the data ingestion phase, which involves identifying the job
vacancies available in the source and downloading their content into an Excel file. All information on the
published job vacancies was transferred to the Excel file.


2.3. Information extraction and data cleaning 

The text retrieved from job-hunting sites contains HTML tags, unnecessary characters, non-textual
characters, and web code, which were automatically stripped out using a modified program in PHP.
In addition, data obtained from job-hunting sites usually contain syntactic features, HTML code, and entities
such as <> that are embedded in the original sites. It was necessary to remove this content from
the data because it might affect the results of the analysis and is not useful for the machine
learning step. Hence, a PHP application module was designed and developed for
cleaning the text retrieved from job-hunting sites. The next step is the information extraction phase, in which the
relevant content of the identified job skills was organized and classified.
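
The cleaning step described above can be illustrated with a minimal sketch. The authors' module was written in PHP and is not reproduced in the paper, so the Python version below is only an assumption about what such a step might look like; the function name `clean_posting` and the set of characters that are kept are hypothetical choices made for illustration.

```python
import html
import re

# Remove whole <script>/<style> blocks first, then any remaining tags.
SCRIPT_STYLE_RE = re.compile(r"<(script|style)\b.*?</\1>", re.IGNORECASE | re.DOTALL)
TAG_RE = re.compile(r"<[^>]+>")
# Assumption: keep letters, digits, whitespace and a few symbols found in skill names (C++, C#, .NET).
NON_TEXT_RE = re.compile(r"[^A-Za-z0-9+#./\s\-]")


def clean_posting(raw):
    """Return the plain text of one job posting with markup and entities removed."""
    text = SCRIPT_STYLE_RE.sub(" ", raw)
    text = TAG_RE.sub(" ", text)
    text = html.unescape(text)          # decode entities such as &amp;
    text = text.replace("\xa0", " ")    # non-breaking spaces become ordinary spaces
    text = NON_TEXT_RE.sub(" ", text)   # drop remaining non-textual characters
    return re.sub(r"\s+", " ", text).strip()


sample = "<p>Knowledge in <b>SQL</b>&nbsp;&amp; C++ databases</p><script>var x = 1;</script>"
cleaned = clean_posting(sample)
```

Regular expressions are enough for a sketch like this; for malformed real-world pages, a proper HTML parser would be a more robust choice.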


[Figure 1 (flow diagram): job information searching of IT job vacancies on job-hunting sites in the
Philippines (1. JobStreet.com, 2. Kalibrr Careers, 3. Freelancer, 4. LinkedIn, 5. Facebook, 6. Google);
data ingestion; information extraction and data cleaning; skills requirements classification;
method (association rule, FP-Growth); skill words/patterns highly related to a certain IT job requirement.]

Figure 1. Research framework


2.4. Skill and job classification 

A job classification was built that organizes the records into exclusive job groups: IT staff, network
administrator, system analyst, computer programmer, and database administrator. This is based on the
primary job roles of a BS IT graduate as stated in the Commission on Higher Education (CHED) memo 25, series of
2015, and the ACM IT curricula 2017. The Excel dataset was appended with an additional attribute, “Job”, and
a job title was manually assigned to each record as implied by its job skills based on the published job. This
work is necessary to provide the algorithm with information about the skills needed for each job.


2.5. Skills pattern recognition 
2.5.1. Pattern recognition process 

The term frequency-inverse document frequency (TF-IDF) scheme was used to reflect the
frequency of the important words. It is determined by counting the number of
occurrences of job skill words in publicly available job websites. Weighting the number of occurrences of skill
terms in online web pages, so that more significant terms receive a greater weight, is the way the dominant skill
words and skill patterns are discovered. The TF-IDF score increases with the number of times a word (skill) appears
in an online job posting, but is offset by the word's frequency in the dataset, which helps to
account for the fact that some words are more prevalent than others [35].
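
A small sketch may make the weighting concrete. The paper does not state which TF-IDF variant was used, so the code below assumes one common form: relative term frequency multiplied by the natural logarithm of the inverse document frequency. The function name `tf_idf` and the three toy postings are hypothetical and are not taken from the study's dataset.

```python
import math
from collections import Counter


def tf_idf(documents):
    """documents: list of token lists. Returns one {term: score} dict per document."""
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))    # count each term once per document

    scores = []
    for tokens in documents:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


# Hypothetical, already cleaned and tokenized postings.
postings = [
    ["sql", "databases", "monitoring"],
    ["sql", "network", "cisco"],
    ["sql", "html", "css"],
]
scores = tf_idf(postings)
```

In this toy data, "sql" appears in every posting and therefore receives a score of zero, while a term that appears in only one posting, such as "databases", receives a positive score. This is the offsetting effect described in the paragraph above.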


2.5.2. Association mining rules 

This stage presents a machine learning model for analyzing skill words and skill word patterns from a
collection of published jobs and their skills requirements by automatically extracting frequent words from each
web site. The association rules are defined and described below.

Consider the following notation for representing association rules mathematically. Let
$A = \{w_1, w_2, \ldots, w_m\}$ be a set of items (skill keywords), and let $T_c = \{t_1, t_2, \ldots, t_n\}$ be the
collection of skills, where each transaction $t_k$ is a set of keywords such that $t_k \subseteq A$. Let $W_i$ be a
set of keywords. A transaction $t_k$ is said to contain $W_i$ if and only if $W_i \subseteq t_k$. An association rule
is an implication of the form $W_i \Rightarrow W_j$, where $W_i \subset A$, $W_j \subset A$, and
$W_i \cap W_j = \emptyset$. There are two basic measures for association rules, support ($s$) and confidence ($c$).
The rule $W_i \Rightarrow W_j$ has support $s$ in the collection $T_c$ if $s\%$ of the transactions in $T_c$ contain
$W_i \cup W_j$. Support is computed as:

$$\mathrm{Support}(W_i \cup W_j) = \frac{\text{number of transactions in } T_c \text{ containing } W_i \cup W_j}{\text{total number of transactions in } T_c}$$

The rule $W_i \Rightarrow W_j$ holds in the collection $T_c$ with confidence $c$ if, among the transactions
that contain $W_i$, $c\%$ also contain $W_j$. Confidence is computed as:

$$\mathrm{Confidence}(W_i \Rightarrow W_j) = \frac{\mathrm{Support}(W_i \cup W_j)}{\mathrm{Support}(W_i)}$$

Support, which is the ratio of the number of instances in which $W_i$ and $W_j$ appear together in a single
transaction to the total number of transactions, is used to quantify frequent itemsets, whereas confidence is
the probability of finding $W_j$ in a transaction that contains $W_i$ [35].
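
The two measures can be checked by hand on a small example. The sketch below assumes that each job posting has already been reduced to a set of skill keywords (one transaction per posting); the seven transactions are invented for illustration and are not the study's data, and the function names are hypothetical.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every keyword in `itemset`."""
    wanted = set(itemset)
    hits = sum(1 for t in transactions if wanted <= set(t))
    return hits / len(transactions)


def confidence(premise, conclusion, transactions):
    """Support of premise and conclusion together, divided by support of the premise.

    Assumes the premise occurs in at least one transaction.
    """
    both = set(premise) | set(conclusion)
    return support(both, transactions) / support(premise, transactions)


# Hypothetical transactions: one set of skill keywords per job posting.
transactions = [
    {"monitoring", "databases", "sql"},
    {"monitoring", "databases"},
    {"databases", "applications"},
    {"databases", "sql"},
    {"network", "cisco"},
    {"html", "css"},
    {"python", "sql"},
]

rule_support = support({"monitoring", "databases"}, transactions)
rule_confidence = confidence({"monitoring"}, {"databases"}, transactions)
```

Here "monitoring" and "databases" occur together in two of the seven transactions, and every transaction that contains "monitoring" also contains "databases", so the rule has full confidence even though its support is modest.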


2.5.3. Correlation of job skills requirements 
To determine the correlation between the words, the lift ratio is used in this study:

$$\mathrm{Lift}(W_i \Rightarrow W_j) = \frac{\mathrm{Confidence}(W_i \Rightarrow W_j)}{\mathrm{Support}(W_j)}$$

If the lift of a rule is greater than one, there is a positive correlation: the pair of skill words appears
together more often than expected. If the lift of a rule is less than one, there is a negative correlation, and
the pair of skill words appears together less often than expected. If the lift of a rule is equal to one, the
skill words are independent: the pair appears together about as often as expected [35].
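
As a hedged illustration of how lift is read, the sketch below computes it in the equivalent form support(premise and conclusion together) divided by the product of the two individual supports, which equals confidence divided by the support of the conclusion. The transactions are invented for illustration, and the helper names `lift` and `interpret` are hypothetical.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every keyword in `itemset`."""
    wanted = set(itemset)
    return sum(1 for t in transactions if wanted <= set(t)) / len(transactions)


def lift(premise, conclusion, transactions):
    """Joint support divided by the product of the individual supports.

    Assumes both the premise and the conclusion occur at least once.
    """
    joint = support(set(premise) | set(conclusion), transactions)
    return joint / (support(premise, transactions) * support(conclusion, transactions))


def interpret(lift_value, tol=1e-9):
    """Map a lift value to the three cases described in the text."""
    if lift_value > 1 + tol:
        return "positive"
    if lift_value < 1 - tol:
        return "negative"
    return "independent"


# Hypothetical transactions: one set of skill keywords per job posting.
transactions = [
    {"monitoring", "databases", "sql"},
    {"monitoring", "databases"},
    {"databases", "applications"},
    {"databases", "sql"},
    {"network", "cisco"},
    {"html", "css"},
    {"python", "sql"},
]
independent_case = [{"sql", "python"}, {"sql"}, {"python"}, {"css"}]

positive_lift = lift({"monitoring"}, {"databases"}, transactions)
neutral_lift = lift({"sql"}, {"python"}, independent_case)
```

A tolerance is used when comparing with one because lift is computed from floating-point ratios.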


2.5.4. The frequent pattern growth (FP-growth) approach

FP-growth is one of the most widely used association rule mining algorithms. The FP-growth algorithm
finds frequent patterns/associations of job words in the dataset
without generating candidate itemsets [35], [36].
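
The idea of mining without candidate generation can be sketched in a few lines. The code below is a compact, illustrative FP-growth and is not the tool or implementation used in the study; the names `FPNode`, `build_tree`, and `fp_growth`, the minimum count, and the toy transactions are all assumptions. Transactions are compressed into a prefix tree, and frequent itemsets are read off recursively from the prefix paths of each item.

```python
from collections import defaultdict


class FPNode:
    def __init__(self, item, parent):
        self.item = item
        self.parent = parent
        self.count = 0
        self.children = {}


def build_tree(weighted_transactions, min_count):
    """Build an FP-tree from (items, count) pairs; return (header table, item frequencies)."""
    freq = defaultdict(int)
    for items, count in weighted_transactions:
        for item in set(items):
            freq[item] += count
    freq = {item: c for item, c in freq.items() if c >= min_count}

    root = FPNode(None, None)
    header = defaultdict(list)          # item -> nodes that carry this item
    for items, count in weighted_transactions:
        # Keep frequent items only, most frequent first (ties broken by name).
        ordered = sorted((i for i in set(items) if i in freq), key=lambda i: (-freq[i], i))
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, node)
                node.children[item] = child
                header[item].append(child)
            child.count += count
            node = child
    return header, freq


def fp_growth(weighted_transactions, min_count, suffix=frozenset()):
    """Yield (frequent itemset, support count) pairs without generating candidates."""
    header, freq = build_tree(weighted_transactions, min_count)
    for item, nodes in header.items():
        itemset = suffix | {item}
        yield itemset, freq[item]
        # Conditional pattern base: the prefix path above every node of `item`.
        conditional = []
        for node in nodes:
            path = []
            parent = node.parent
            while parent is not None and parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                conditional.append((path, node.count))
        yield from fp_growth(conditional, min_count, itemset)


# Hypothetical skill transactions, each with weight 1.
postings = [
    {"sql", "databases", "monitoring"},
    {"sql", "databases"},
    {"databases", "applications"},
    {"network", "cisco"},
]
frequent = dict(fp_growth([(p, 1) for p in postings], min_count=2))
```

With a minimum count of two, only "databases", "sql", and the pair of the two are frequent in this toy data. Association rules and their confidence and lift values would then be derived from these frequent itemsets.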


3. RESULTS AND DISCUSSION 

The aim of this study is to find and analyze the relationship between job skill word patterns and
jobs posted online. The results reveal the skills required for certain IT job positions. They are based on skill
(term) frequency and co-occurrence within each job posting, and they represent the domain skill
words related to a particular job.


3.1. Skills required for database administrator 

The association rule results in Table 1 reveal the relationships among the job skill requirements for a
database administrator: monitoring of databases, database applications, knowledge of SQL
databases, managing database technologies, and knowledge of business. These skill rules are the basic skill
requirements for a database administrator based on the published IT jobs and job skill requirements, and they
also correspond to the industry-wide technical competencies under the IT competency model [18].


Table 1. Discovered job skills required for database administrator

Premises                 Conclusion     Support    Confidence  Lift
Monitoring               Databases      0.285714   1           1.75
Applications             Databases      0.285714   1           1.75
SQL                      Databases      0.285714   1           1.75
Knowledge                Business       0.285714   1           2.33
Manage                   Technologies   0.285714   1           3.5
Applications             Software       0.285714   1           3.5
experience, technical    Solutions      0.285714   1           3.5
technical, solutions     Experience     0.285714   1           1.75
business, intelligence   Databases      0.285714   1           1.75
databases, software      Applications   0.285714   1





3.2. Skills required for a system analyst 

Table 2 reveals that years of experience in design, system analysis, and
experience in SQL are the identified skill rules that a system analyst must possess. In addition, relevant
knowledge in the areas of design, analysis, problem solving, and SQL is also desired by employers for this IT
job. This implies that experience in system analysis and design is the main skill that a system analyst
should possess. The lift values indicate that relevant knowledge and problem-solving skills are the most needed
skills for a system analyst, based on the occurrences of the skill words in the job postings for system analyst.


Table 2. Discovered job skills required for system analyst








Premises            Conclusion        Support  Confidence  Lift
Equivalent          Work              0.26     0.80        1.46
Design              Analysis          0.26     0.80        2.07
years, analysis     Design            0.26     0.80        2.48
Design              years, analysis   0.26     0.80        2.48
SQL                 Years             0.26     1.00        1.72
Solving             Problem           0.29     1.00        3.44
Relevant            Knowledge         0.26     1.00        3.88
work, equivalent    Years             0.26     1.00        1.72
analysis, design    Years             0.26     1.00        1.72





3.3. IT staff 

Table 3 shows that the skill needed for IT staff is to be a graduate of information technology or computer
science. Graduates of these courses acquire the set of foundational and employability skills,
knowledge, and abilities that are required of all information workers. These are the
universal skills: problem solving and applying technical knowledge and tools effectively. This indicates
that an IT staff member needs to be a graduate of a computing or IT course, because such a graduate has the
basic IT knowledge and skills needed for IT staff.


Table 3. Discovered job skills required for IT staff

Premises                            Conclusion                Support  Confidence  Lift
Information, Computer               Science                   0.44     0.93        1.87
Technology, Information, Computer   Science                   0.44     0.93        1.87
Information, Science                Technology, Computer      0.44     0.93        1.99
Technology, Information, Science    Computer                  0.44     0.93        1.66
Computer, Science                   Technology, Information   0.44     0.93        1.49
Information                         Technology                0.63     1.00        1.60
Technology, Computer, Science       Information               0.44     1.00        1.60
Information, Computer, Science      Technology                0.44     1.00        1.60





3.4. Computer programmer 

Table 4 reveals that a computer programmer must have skills and knowledge in SQL databases and
be knowledgeable in Python, JavaScript, and Office as programming and application productivity tools. In
addition, CSS, SP, and HTML are basic skill requirements for a computer programmer
based on the published IT jobs and skill requirements. Furthermore, software design and
software engineering were included as additional skills required for a computer programmer. The analysis of
[16] also indicates that job ads for IT jobs request HTML, Java, JavaScript, and MySQL skills. This
indicates that, aside from programming skills, computer programmers should have experience in software design
and knowledge of software engineering.


Table 4. Discovered job skills required for computer programmers

Premises      Conclusion   Support  Confidence  Lift
SQL           Python       0.07     0.24        2.59
SQL           JavaScript   0.07     0.24        2.59
SQL           Office       0.08     0.29        3.62
Software      SP           0.07     0.36        3.02
Software      System       0.07     0.36        3.39
Design        Software     0.07     0.38        2.92
Engineering   Software     0.07     0.42        2.26
Software      Design       0.07     0.50        2.92
SP            Software     0.07     0.56        3.02
CSS           SP           0.07     0.56        4.69
CSS           HTML         0.07     0.56        5.28





3.5. Network administrator 

Table 5 reveals the relationship between the network administrator position and job skill requirements. The
results show that network administrators must have experience in network engineering,
knowledge of networking, and skills in network troubleshooting and Cisco networks. These skill rules are the
skill requirements for a network administrator based on the published IT jobs and skill requirements. The skill
rules are consistent with the skills/competencies required of a network administrator under the IT
competency model [18], [37]: experience and knowledge in networking, network engineering, and
Cisco, and the ability to troubleshoot networks. These are network-related technical skills that a network
administrator should possess.


Table 5. Discovered job skills required for network administrator

Premises               Conclusion            Support  Confidence  Lift
Engineering            Network               0.24     0.80        1.46
Engineering            experience, network   0.24     0.80        1.77
Network                Experience            0.45     0.82        1.20
Knowledge              Network               0.27     0.82        1.50
experience, Cisco      Network               0.25     0.87        1.58
Troubleshooting        Network               0.27     0.88        1.59
network, engineering   Experience            0.24     1.00        1.46





This study analyzes job skills requirements based on IT job vacancies posted online, which
provide information about IT jobs and their skills requirements. The findings show that the skill word
association results enhance information extraction from job descriptions posted online. By gathering publicly
available online data with web and text mining tools, the study was able to extract important facts about the word
patterns of job skill requirements and the IT skills wanted by employers. The findings of this study offset the
limitation of similar studies [18], [37] by providing the skills information needed for a specific job.
Furthermore, the results provide current job skill requirements, which are important in the revision and
enhancement of the BSIT curriculum.


4. CONCLUSION 

This study proposes a methodology for identifying and analyzing published job skills and IT jobs
using the frequency of occurrence of skill words as requirements of the job. The proposed methodology is
innovative in identifying the skills required for a certain IT job posted online. By applying automated
techniques, the proposed method is able to retrieve and process large amounts of data posted online and
analyze information about the skills qualifications for a certain job. In addition, it provides direct, actionable
information about skills demand that can be useful in planning and developing program curricula. Thus, the
results of the study also help educational institutions to understand the relationship between the posted jobs
and the required skills/knowledge that need to be incorporated into their curricula. It is therefore advisable
to identify demands in greater detail and to bridge the gap between skills needs and supply with more flexible
program curricula. Furthermore, job and skill demand analysis is pertinent to the modern, data- and
technology-dependent world, where skills and capabilities in a variety of industry sectors must be updated to
cope with this new, invaluable source of knowledge. Future work will further
explore other text mining and visualization tools. Many available tools and applications
can be tested for their information retrieval capabilities, specifically in the area of skill word and skill word
pattern recognition, and other potentially useful sources of data can be searched, such as web-based
repositories like online forums, blogs, and bulletin boards.